Abstract. A constructive proof of the Gödel-Rosser incompleteness theorem [9]
has been completed using the Coq proof assistant. Some theory
of classical first-order logic over an arbitrary language is formalized. A 
development of primitive recursive functions is given, and all primitive 
recursive functions are proved to be representable in a weak axiom sys- 
tem. Formulas and proofs are encoded as natural numbers, and functions 
operating on these codes are proved to be primitive recursive. The weak 
axiom system is proved to be essentially incomplete. In particular, Peano 
arithmetic is proved to be consistent in Coq's type theory and therefore 
is incomplete. 
1 Introduction

The Gödel-Rosser incompleteness theorem for arithmetic states that any complete
first-order theory of a nice axiom system, using only the symbols $+$, $\times$,
$0$, $S$, and $<$, is inconsistent. A nice axiom system must contain the nine specific
axioms of a system called NN. These nine axioms serve to define the previous
symbols. A nice axiom system must also be expressible in itself. This last restriction
prevents the incompleteness theorem from applying to axiom systems
such as the true first-order statements about $\mathbb{N}$.

* * * This paper appears in the proceedings of the 18th International Conference on Theorem
Proving in Higher Order Logics (TPHOLs 2005)




A computer verified proof of Gödel's incompleteness theorem is not new.
In 1986 Shankar created a proof of the incompleteness of Z2, hereditarily finite
set theory, in the Boyer-Moore theorem prover [12]. My work is the first
computer verified proof of the essential incompleteness of arithmetic. Harrison
recently completed a proof in HOL Light [8] of the essential incompleteness of
$\Sigma_1$-complete theories, but has not shown that any particular theory is $\Sigma_1$-complete.
His work will be included in the next release of HOL Light.

My proof was developed and checked in Coq 7.3.1 using Proof General under
XEmacs. It is part of the user contributions to Coq and can be checked in Coq
8.0 [14]. Examples of source code in this document use the new Coq 8.0 notation.

Coq is an implementation of the calculus of (co)inductive constructions. This
dependent type theory has intensional equality and is constructive, so my proof is
constructive. Actually the proof depends on the Ensembles library which declares
an axiom of extensionality for Ensembles, but this axiom is never used.

This document points out some of the more interesting problems I encountered
when formalizing the incompleteness theorem. My proof mostly follows the
presentation of incompleteness given in An Introduction to Mathematical Logic
[10]. I referred to the supplementary text for the book Logic for Mathematics
and Computer Science [1] to construct Gödel's $\beta$-function. I also use part of
Caprotti and Oostdijk's contribution of Pocklington's criterion to prove the
Chinese remainder theorem.

This document is organized as follows. First I discuss the difficulties I had 
when formalizing classical first-order logic over an arbitrary language. This is 
followed by the definition of a language LNN and an axiom system called NN. 
Next I give the statement of the essential incompleteness of NN. Then I briefly 
discuss coding formulas and proofs as natural numbers. Next I discuss primitive 
recursive functions and the problems I encountered when trying to prove that 
substitution can be computed by a primitive recursive function. Finally I briefly 
discuss the fixed-point theorem, Rosser's incompleteness theorem, and the incompleteness
of PA. At the end I give some remarks about how to extend my
work in order to formalize Gödel's second incompleteness theorem.

1.1 Coq Notation 

For those not familiar with Coq syntax, here is a short list of notation:

- ->, /\, \/, and ~ are the logical connectives $\Rightarrow$, $\wedge$, $\vee$, and $\neg$.

- A -> B, A * B, and A + B form function types, Cartesian product types,
  and disjoint union types.

- *, +, and S are the arithmetic operations of multiplication, addition, and
  successor.

- inl and inr are the left and right injection functions of types A -> A + B
  and B -> A + B.

- :: and ++ are the list operations cons and append.

- _ is an omitted parameter that Coq can infer itself.

For more details see the Coq 8.0 reference manual [14].



2 First-Order Classical Logic 



I began by developing the theory of first order classical logic inside Coq. In 
essence Coq's logic is a formal metalogic to reason about this internal logic. 

2.1 Definition of Language 

I immediately took advantage of Coq's dependent type system by defining
Language to be a dependent record of types for symbols and an arity function
from symbols to $\mathbb{N}$. The Coq code is:

Record Language : Type := language
  {Relations : Set;
   Functions : Set;
   arity : Relations + Functions -> nat}.

In retrospect it would have been slightly more convenient to use two arity func- 
tions instead of using the disjoint union type. 

This approach differs from Harrison's definition of first order terms and formulas
in HOL Light [8] because HOL Light does not have dependent types.
Dependent types allow the type system to enforce that all terms and formulas 
of a given language are well formed. 

2.2 Definition of Term 

For any given language, a Term is either a variable indexed by a natural number 
or a function symbol plus a list of n terms where n is the arity of the function 
symbol. My first attempt at writing this in Coq failed. 

Variable L : Language.
(* Invalid definition *)
Inductive Term0 : Set :=
  | var0 : nat -> Term0
  | apply0 : forall (f : Functions L) (l : list Term0),
      (arity L (inr _ f))=(length l) -> Term0.

The type (arity L (inr _ f)) = (length l) fails to meet Coq's positivity requirement
for inductive types. Expanding the definition of length reveals a
hidden occurrence of Term0 which is passed as an implicit argument to length.
It is this occurrence that violates the positivity requirement.

My second attempt met the positivity requirement, but it had other difficulties.
A common way to create polymorphic lists of length n is:

Inductive Vector (A : Set) : nat -> Set :=
  | Vnil : Vector A 0
  | Vcons : forall (a : A) (n : nat),
      Vector A n -> Vector A (S n).



Using this I could have defined Term like:

Variable L : Language.

Inductive Term1 : Set :=
  | var1 : nat -> Term1
  | apply1 : forall f : Functions L,
      (Vector Term1 (arity L (inr _ f))) -> Term1.

My difficulty with this definition was that the induction principle generated by 
Coq is too weak to work with. 

Instead I created two mutually inductive types: Term and Terms. 

Variable L : Language.

Inductive Term : Set :=
  | var : nat -> Term
  | apply : forall f : Functions L,
      Terms (arity L (inr _ f)) -> Term
with Terms : nat -> Set :=
  | Tnil : Terms 0
  | Tcons : forall n : nat,
      Term -> Terms n -> Terms (S n).

Again the automatically generated induction principle is too weak, so I used the 
Scheme command to generate suitable mutual-inductive principles. 

The disadvantage of this approach is that useful lemmas about Vectors must 
be reproved for Terms. Some of these lemmas are quite tricky to prove because 
of the dependent type. For example, proving forall x : Terms 0, Tnil = x 
is not easy. 
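
The mutual definition, the Scheme command, and the lemma about Terms 0 can be sketched together in current Coq syntax. This is an illustration rather than the code of the development: the section name, the names chosen for the generated principles, the lemma name nilTerms, and the proof term are all assumptions.

```coq
(* Sketch only: restates Language, Term and Terms so the block is
   self-contained. Names not found in the text are hypothetical. *)
Record Language : Type := language
  { Relations : Set;
    Functions : Set;
    arity : Relations + Functions -> nat }.

Section Terms_Sketch.

Variable L : Language.

Inductive Term : Set :=
  | var : nat -> Term
  | apply : forall f : Functions L, Terms (arity L (inr _ f)) -> Term
with Terms : nat -> Set :=
  | Tnil : Terms 0
  | Tcons : forall n : nat, Term -> Terms n -> Terms (S n).

(* Ask Coq for mutual induction principles; the default Term_ind gives
   no induction hypothesis for the Terms argument of apply. *)
Scheme Term_Terms_ind := Induction for Term Sort Prop
  with Terms_Term_ind := Induction for Terms Sort Prop.

(* The fixed index 0 blocks a plain case analysis, so the match states
   the goal for an arbitrary index n and specialises it back to 0. *)
Lemma nilTerms : forall x : Terms 0, Tnil = x.
Proof.
  intro x.
  refine
    (match x as y in Terms n
       return ((match n return (Terms n -> Prop) with
                | 0 => fun z => Tnil = z
                | S _ => fun _ => True
                end) y)
     with
     | Tnil => eq_refl
     | Tcons _ _ _ => I
     end).
Qed.

End Terms_Sketch.
```

The return clause is the delicate part: it must be a type family over every index, which is why the successor case is assigned the trivial proposition True.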

Recently, Marché has shown me that the Term1 definition would be adequate.
One can explicitly make a sufficient induction principle by using nested Fixpoint
functions [7].

2.3 Definition of Formula 

The definition of Formula was straightforward. 

Inductive Formula : Set :=
  | equal : Term -> Term -> Formula
  | atomic : forall r : Relations L, Terms (arity L (inl _ r)) ->
      Formula
  | impH : Formula -> Formula -> Formula
  | notH : Formula -> Formula
  | forallH : nat -> Formula -> Formula.



I defined the other logical connectives in terms of impH, notH, and forallH.



The H at the end of the logic connectives (such as impH) stands for "Hilbert" 
and is used to distinguish them from Coq's connectives. 

For example, the formula $\neg\forall x_0.\forall x_1. x_0 = x_1$ would be represented by:

notH (forallH 0 (forallH 1 (equal (var 0) (var 1))))

It would be nice to use higher order abstract syntax to handle bound variables 
by giving forallH the type (Term -> Formula) -> Formula. I would represent 
the above example as: 

notH (forallH (fun x : Term =>
  (forallH (fun y : Term => (equal x y)))))

This technique would require additional work to disallow "exotic terms" that are
created by passing a function into forallH that does a case analysis on the
term and returns entirely different formulas in different cases. Despeyroux et
al. [3] address this problem by creating a complicated predicate that only valid
formulas satisfy.

Another choice would have been to use de Bruijn indexes to eliminate named 
variables. However dealing with free and bound variables with de Bruijn indexes 
can be difficult. 

Using named variables allowed me to closely follow Hodel's work [10]. Also, in
order to help persuade people that the statement of the incompleteness theorem 
is correct, it is helpful to make the underlying definitions as familiar as possible. 

Renaming bound variables turned out to be a constant source of work during 
development because variable names and terms were almost always abstract. In 
principle the variable names could conflict, so it was constantly necessary to 
consider this case and deal with it by renaming a bound variable to a fresh one. 
Perhaps it would have been better to use de Bruijn indexes and a deduction 
system that only deduced closed formulas. 

2.4 Definition of substituteFormula 

I defined the function substituteFormula to substitute a term for all occurrences
of a free variable inside a given formula. While the definition of
substituteTerm is simple structural recursion, substitution for formulas is complicated
by quantifiers. Suppose we want to substitute the term $s$ for $x_i$ in the
formula $\forall x_j.\varphi$ and $i \neq j$. Suppose $x_j$ is a free variable of $s$. If we naively perform
the substitution then the occurrences of $x_j$ in $s$ get captured by the quantifier.
One common solution to this problem is to disallow substitution for a term $s$
when $s$ is not substitutable for $x_i$ in $\varphi$. The solution I take is to rename the
bound variable in this case.

$(\forall x_j.\varphi)[x_i/s] = \forall x_k.(\varphi[x_j/x_k])[x_i/s]$ where $k \neq i$ and $x_k$ is not free in $\varphi$ or $s$

Unfortunately this definition is not structurally recursive. The second substitu- 
tion operates on the result of the first substitution, which is not structurally 
smaller than the original formula. 



Coq will not accept this recursive definition as is; it is necessary to prove 
the recursion will terminate. I proved that substitution preserves the depth of a 
formula, and that each recursive call operates on a formula of smaller depth. 
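
The depth argument can be illustrated on a toy formula type. The sketch below is not the development's substituteFormula: the type form, its constructors, and the functions depth, renameVar and rename are assumptions made for the example, terms are restricted to variables, and variable capture is deliberately ignored. It only shows the kind of lemma that makes depth usable as a termination measure.

```coq
Require Import Arith.

(* Toy formulas whose only terms are variables (hypothetical type). *)
Inductive form : Set :=
  | FEq : nat -> nat -> form
  | FImp : form -> form -> form
  | FNot : form -> form
  | FAll : nat -> form -> form.

Fixpoint depth (f : form) : nat :=
  match f with
  | FEq _ _ => 0
  | FImp a b => S (Nat.max (depth a) (depth b))
  | FNot a => S (depth a)
  | FAll _ a => S (depth a)
  end.

Definition renameVar (v w x : nat) : nat :=
  if Nat.eqb x v then w else x.

(* Rename free occurrences of v to w; structurally recursive. *)
Fixpoint rename (v w : nat) (f : form) : form :=
  match f with
  | FEq x y => FEq (renameVar v w x) (renameVar v w y)
  | FImp a b => FImp (rename v w a) (rename v w b)
  | FNot a => FNot (rename v w a)
  | FAll n a => if Nat.eqb n v then FAll n a else FAll n (rename v w a)
  end.

(* Renaming never changes the depth, so a second substitution applied
   to a renamed subformula still works on something of smaller depth. *)
Lemma rename_depth :
  forall (v w : nat) (f : form), depth (rename v w f) = depth f.
Proof.
  intros v w f.
  induction f as [x y | a IHa b IHb | a IHa | n a IHa]; simpl.
  - reflexivity.
  - rewrite IHa, IHb. reflexivity.
  - rewrite IHa. reflexivity.
  - destruct (Nat.eqb n v); simpl.
    + reflexivity.
    + rewrite IHa. reflexivity.
Qed.
```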

One of McBride's mantras says, "If my recursion is not structural, I am using
the wrong structure" [11, p. 241]. In this case, my recursion is not structural
because I am using the wrong recursion. Stoughton shows that it is easier to
define substitution that substitutes all variables simultaneously because the recursion
is structural [13]. If I had made this definition first, I could have defined
substitution of one variable in terms of it and many of my difficulties would have
disappeared.

2.5 Definition of Prf 

I defined the inductive type (Prf Gamma phi) to be the type of proofs of phi, 
from the list of assumptions Gamma. 

Inductive Prf : Formulas -> Formula -> Set :=
  | AXM : forall A : Formula, Prf (A :: nil) A
  | MP : forall (Axm1 Axm2 : Formulas) (A B : Formula),
      Prf Axm1 (impH A B) -> Prf Axm2 A ->
      Prf (Axm1 ++ Axm2) B
  | GEN : forall (Axm : Formulas) (A : Formula) (v : nat),
      ~ In v (freeVarListFormula L Axm) -> Prf Axm A ->
      Prf Axm (forallH v A)
  | IMP1 : forall A B : Formula, Prf nil (impH A (impH B A))
  | IMP2 : forall A B C : Formula,
      Prf nil (impH (impH A (impH B C))
                    (impH (impH A B) (impH A C)))
  | CP : forall A B : Formula,
      Prf nil (impH (impH (notH A) (notH B)) (impH B A))
  | FA1 : forall (A : Formula) (v : nat) (t : Term),
      Prf nil (impH (forallH v A) (substituteFormula L A v t))
  | FA2 : forall (A : Formula) (v : nat),
      ~ In v (freeVarFormula L A) ->
      Prf nil (impH A (forallH v A))
  | FA3 : forall (A B : Formula) (v : nat),
      Prf nil
        (impH (forallH v (impH A B))
              (impH (forallH v A) (forallH v B)))
  | EQ1 : Prf nil (equal (var 0) (var 0))
  | EQ2 : Prf nil (impH (equal (var 0) (var 1))
                        (equal (var 1) (var 0)))
  | EQ3 : Prf nil
      (impH (equal (var 0) (var 1))
            (impH (equal (var 1) (var 2)) (equal (var 0) (var 2))))
  | EQ4 : forall R : Relations L, Prf nil (AxmEq4 R)
  | EQ5 : forall f : Functions L, Prf nil (AxmEq5 f).

AxmEq4 and AxmEq5 are recursive functions that generate the equality axioms for
relations and functions. AxmEq4 R generates

$x_0 = x_1 \Rightarrow \dots \Rightarrow x_{2n-2} = x_{2n-1} \Rightarrow (R(x_0, \dots, x_{2n-2}) \Leftrightarrow R(x_1, \dots, x_{2n-1}))$

and AxmEq5 f generates

$x_0 = x_1 \Rightarrow \dots \Rightarrow x_{2n-2} = x_{2n-1} \Rightarrow f(x_0, \dots, x_{2n-2}) = f(x_1, \dots, x_{2n-1})$

I found that replacing ellipses from informal proofs with recursive functions 
was one of the most difficult tasks. The informal proof does not contain informa- 
tion on what inductive hypothesis should be used when reasoning about these 
recursive definitions. Figuring out the correct inductive hypotheses was not easy. 
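
As a small illustration of turning an ellipsis into a recursive function, the sketch below builds the chain of hypotheses $x_0 = x_1 \Rightarrow \dots \Rightarrow x_{2n-2} = x_{2n-1} \Rightarrow \mathit{body}$ over a toy formula type. It is not AxmEq4 or AxmEq5 from the development: form, chainFrom, chain and hyps are hypothetical names, and the extra counter k is one possible design, chosen to show why the inductive hypothesis has to be generalized.

```coq
(* Toy formulas: equations between variables and implications. *)
Inductive form : Set :=
  | FEq : nat -> nat -> form
  | FImp : form -> form -> form.

(* chainFrom k n body is
   x_k = x_(k+1) => x_(k+2) = x_(k+3) => ... => body, with n hypotheses. *)
Fixpoint chainFrom (k n : nat) (body : form) : form :=
  match n with
  | 0 => body
  | S m => FImp (FEq k (S k)) (chainFrom (S (S k)) m body)
  end.

Definition chain (n : nat) (body : form) : form := chainFrom 0 n body.

Example chain_2 :
  chain 2 (FEq 8 9) = FImp (FEq 0 1) (FImp (FEq 2 3) (FEq 8 9)).
Proof. reflexivity. Qed.

(* Number of hypotheses in front of the final conclusion. *)
Fixpoint hyps (f : form) : nat :=
  match f with
  | FEq _ _ => 0
  | FImp _ b => S (hyps b)
  end.

(* The statement quantifies over every start index k. Proving it only
   for k = 0 would leave an induction hypothesis that does not apply,
   because the recursive call uses S (S k). *)
Lemma hyps_chainFrom :
  forall (n k : nat) (body : form),
    hyps (chainFrom k n body) = n + hyps body.
Proof.
  induction n as [| m IH]; intros k body; simpl.
  - reflexivity.
  - rewrite IH. reflexivity.
Qed.
```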

2.6 Definition of SysPrf 

There are some problems with the definition of Prf given. It requires the list
of axioms to be in the correct order for the proof. For example, if we have Prf
Gamma1 (impH phi psi) and Prf Gamma2 phi then we can conclude only Prf
(Gamma1 ++ Gamma2) psi. We cannot conclude Prf (Gamma2 ++ Gamma1) psi, or a proof
of psi from any other permutation of the assumptions. If an axiom is used more
than once, it must appear in the list more than once. If an axiom is never used,
it must not appear. Also, the number of axioms must be finite because they form a list.

To solve this problem, I defined a System to be Ensemble Formula, and 
(SysPrf T phi) to be the proposition that the system T proves phi. 

Definition System := Ensemble Formula.
Definition mem := Ensembles.In.

Definition SysPrf (T : System) (f : Formula) :=
  exists Axm : Formulas,
    (exists prf : Prf Axm f,
       (forall g : Formula, In g Axm -> mem _ T g)).

Ensemble A represents subsets of A by functions A -> Prop: a : A is considered
to be a member of T : Ensemble A if and only if the type T a is inhabited.
I also defined mem to be Ensembles.In so that it does not conflict with List.In.

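A minimal sketch of how an Ensemble behaves, using the standard library's Ensembles module. The set Evens and the example name are hypothetical and are not part of the development; the point is only that membership is a proposition, so an infinite set poses no difficulty.

```coq
Require Import Ensembles.

(* Same renaming as in the text, to avoid a clash with List.In. *)
Definition mem := Ensembles.In.

(* An infinite subset of nat, given by its membership predicate. *)
Definition Evens : Ensemble nat := fun n => exists k : nat, n = 2 * k.

(* Membership unfolds to the predicate applied to the element. *)
Example four_is_even : mem nat Evens 4.
Proof.
  unfold mem, Ensembles.In, Evens.
  exists 2. reflexivity.
Qed.
```
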
2.7 The Deduction Theorem 

The deduction theorem states that if $\Gamma \cup \{\varphi\} \vdash \psi$ then $\Gamma \vdash \varphi \Rightarrow \psi$.

There is a choice of whether the side condition for the $\forall$-generalization rule,
~ In v (freeVarListFormula L Axm), should be required or not. If this side
condition is removed then the deduction theorem requires a side condition on it.
Usually all the formulas in an axiom system are closed, so the side condition on
the $\forall$-generalization is easy to show. So I decided to keep the side condition on
the $\forall$-generalization rule.

At one point the proof of the deduction theorem requires proving that if
$\Gamma \cup \{\varphi\} \vdash \psi$ because $\psi \in \Gamma \cup \{\varphi\}$, then $\Gamma \vdash \varphi \Rightarrow \psi$. There are two cases
to consider. If $\psi = \varphi$ then the result easily follows from the reflexivity of $\Rightarrow$.
Otherwise $\psi \in \Gamma$, and therefore $\Gamma \vdash \psi$. The result then follows. In order to
constructively make this choice it is necessary to decide whether $\psi = \varphi$ or not.
This requires Formula to be a decidable type, and that requires the language L
to be decidable. Since L could be anything, I needed to add hypotheses that the
function and relation symbols are decidable types.

- forall x y : Functions L, { x=y } + { x<>y }

- forall x y : Relations L, { x=y } + { x<>y }.

I used the deduction theorem without restriction and ended up using the hy- 
potheses in many lemmas. I expect that many of these lemmas could be proved 
without assuming the decidability of the language. It is hard to imagine a use- 
ful language that is not decidable, so I do not feel too bad about using these 
hypotheses in unnecessary places. 
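
For a concrete language these hypotheses are cheap to discharge. The sketch below uses a hypothetical four-symbol type in the style of the function symbols of number theory; the lemma and example names are assumptions. The decide equality tactic builds the decision procedure, and ending the proof with Defined keeps it computable.

```coq
(* A hypothetical enumeration of function symbols. *)
Inductive LNTFunction : Set := Plus | Times | Succ | Zero.

(* Equality on a finite enumeration is decidable. *)
Lemma LNTFunction_dec : forall x y : LNTFunction, {x = y} + {x <> y}.
Proof. decide equality. Defined.

(* Because the proof is transparent, the decision actually computes. *)
Example plus_is_not_times :
  (if LNTFunction_dec Plus Times then true else false) = false.
Proof. reflexivity. Qed.
```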

2.8 Languages and Theories of Number Theory 

I created two languages. The first language, LNT, is the language of number 
theory and just has the function symbols Plus, Times, Succ, and Zero with 
appropriate arities. The second language, LNN, is the language of NN and has 
the same function symbols as LNT plus one relation symbol for less than, LT. 
I define two axiom systems: NN and PA. NN and PA share six axioms. 

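1. $\forall x_0. \neg Sx_0 = 0$

2. $\forall x_0. \forall x_1. (Sx_0 = Sx_1 \Rightarrow x_0 = x_1)$

3. $\forall x_0. x_0 + 0 = x_0$

4. $\forall x_0. \forall x_1. x_0 + Sx_1 = S(x_0 + x_1)$

5. $\forall x_0. x_0 \times 0 = 0$

6. $\forall x_0. \forall x_1. x_0 \times Sx_1 = (x_0 \times x_1) + x_0$

NN has three additional axioms about less than.

1. $\forall x_0. \neg x_0 < 0$

2. $\forall x_0. \forall x_1. (x_0 < Sx_1 \Rightarrow (x_0 = x_1 \vee x_0 < x_1))$

3. $\forall x_0. \forall x_1. (x_0 < x_1 \vee x_0 = x_1 \vee x_1 < x_0)$

PA has an infinite number of induction axioms that follow one schema.

1. (schema) $\forall x_{i_1}. \dots \forall x_{i_n}. \varphi[x_j/0] \Rightarrow \forall x_j.(\varphi \Rightarrow \varphi[x_j/Sx_j]) \Rightarrow \forall x_j.\varphi$

The $x_{i_1}, \dots, x_{i_n}$ are the free variables of $\forall x_j.\varphi$. The quantifiers ensure that all
the axioms of PA are closed.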

Because NN is in a different language than PA, a proof in NN is not a
proof in PA. In order to reuse the work done in NN, I created a function
called LNN2LNT_formula to convert formulas in LNN into formulas in LNT by
replacing occurrences of $t_0 < t_1$ with $(\exists x_2. x_0 + (Sx_2) = x_1)[x_0/t_0, x_1/t_1]$,
where $\varphi[x_0/t_0, x_1/t_1]$ is the simultaneous substitution of $t_0$ for $x_0$ and $t_1$ for $x_1$. Then
I proved that if $\mathrm{NN} \vdash \varphi$ then $\mathrm{PA} \vdash \mathrm{LNN2LNT\_formula}(\varphi)$.

I also created the function natToTerm : nat -> Term to return the closed
term representing a given natural number. In this document I will refer to this
function as $\ulcorner \cdot \urcorner$, so $\ulcorner 0 \urcorner = 0$, $\ulcorner 1 \urcorner = S0$, etc.

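The language LNN and the numeral function can be sketched as follows. This is a self-contained reconstruction, not the development's source: Term and Terms take the language as an explicit parameter instead of a section variable, and the names LNNRelation, LNTFunction, LNNArity and the example are assumptions.

```coq
Record Language : Type := language
  { Relations : Set;
    Functions : Set;
    arity : Relations + Functions -> nat }.

Inductive Term (L : Language) : Set :=
  | var : nat -> Term L
  | apply : forall f : Functions L, Terms L (arity L (inr _ f)) -> Term L
with Terms (L : Language) : nat -> Set :=
  | Tnil : Terms L 0
  | Tcons : forall n : nat, Term L -> Terms L n -> Terms L (S n).

(* Symbols of LNN: one relation, four functions. *)
Inductive LNNRelation : Set := LT.
Inductive LNTFunction : Set := Plus | Times | Succ | Zero.

Definition LNNArity (x : LNNRelation + LNTFunction) : nat :=
  match x with
  | inl LT => 2
  | inr Plus => 2
  | inr Times => 2
  | inr Succ => 1
  | inr Zero => 0
  end.

Definition LNN : Language := language LNNRelation LNTFunction LNNArity.

(* The closed term S (S (... 0)) with n successors. The arities are
   checked by computation: arity LNN (inr _ Succ) reduces to 1. *)
Fixpoint natToTerm (n : nat) : Term LNN :=
  match n with
  | 0 => apply LNN Zero (Tnil LNN)
  | S m => apply LNN Succ (Tcons LNN 0 (natToTerm m) (Tnil LNN))
  end.

Example natToTerm_1 :
  natToTerm 1 =
  apply LNN Succ (Tcons LNN 0 (apply LNN Zero (Tnil LNN)) (Tnil LNN)).
Proof. reflexivity. Qed.
```
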
3 Coding 

To prove the incompleteness theorem, it is necessary for the inner logic to reason 
about proofs and formulas, but the inner logic can only reason about natural 
numbers. It is therefore necessary to code proofs and formulas as natural num- 
bers. 

Gödel's original approach was to code a formula as a list of numbers and
then code that list using properties from the prime decomposition theorem [9].
I avoided needing theorems about prime decomposition by using the Cantor
pairing function instead. The Cantor pairing function, cPair, is a commonly
used bijection between $\mathbb{N} \times \mathbb{N}$ and $\mathbb{N}$:

$\mathrm{cPair}(a, b) := a + \sum_{i=1}^{a+b} i$

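A minimal Coq sketch of a Cantor pairing function follows. The particular formula, the first component plus the triangular number of the sum, is one standard form and is an assumption here, as is the helper name sumToN; the development's definition may differ in detail.

```coq
(* sumToN n = 1 + 2 + ... + n *)
Fixpoint sumToN (n : nat) : nat :=
  match n with
  | 0 => 0
  | S m => S m + sumToN m
  end.

(* Pairs are enumerated along the diagonals a + b = 0, 1, 2, ... *)
Definition cPair (a b : nat) : nat := a + sumToN (a + b).

(* The first six pairs receive the codes 0 to 5, without gaps. *)
Example cPair_0_0 : cPair 0 0 = 0. Proof. reflexivity. Qed.
Example cPair_0_1 : cPair 0 1 = 1. Proof. reflexivity. Qed.
Example cPair_1_0 : cPair 1 0 = 2. Proof. reflexivity. Qed.
Example cPair_0_2 : cPair 0 2 = 3. Proof. reflexivity. Qed.
Example cPair_1_1 : cPair 1 1 = 4. Proof. reflexivity. Qed.
Example cPair_2_0 : cPair 2 0 = 5. Proof. reflexivity. Qed.
```
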


All my inductive structures were easy to recursively encode. I gave each con- 
structor a unique number and paired that number with the encoding of all its 
parameters. For example, I defined codeFormula as: 

Fixpoint codeFormula (f : Formula) : nat :=
  match f with
  | fol.equal t1 t2 => cPair 0 (cPair (codeTerm t1) (codeTerm t2))
  | fol.impH f1 f2 =>
      cPair 1 (cPair (codeFormula f1) (codeFormula f2))
  | fol.notH f1 => cPair 2 (codeFormula f1)
  | fol.forallH n f1 => cPair 3 (cPair n (codeFormula f1))
  | fol.atomic R ts => cPair (4+(codeR R)) (codeTerms _ ts)
  end.

where codeR is a coding of the relation symbols for the language.
I will use $\ulcorner \varphi \urcorner$ for $\ulcorner \mathrm{codeFormula}\ \varphi \urcorner$ and $\ulcorner t \urcorner$ for $\ulcorner \mathrm{codeTerm}\ t \urcorner$.

4 The Statement of Incompleteness 

The incompleteness theorem states the essential incompleteness of NN, meaning 
that for every axiom system T such that 






- $\mathrm{NN} \subseteq T$

- T can represent its own axioms

- T is a decidable set

then there exists a sentence $\varphi$ such that if $T \vdash \varphi$ or $T \vdash \neg\varphi$ then T is inconsistent.

The theorem is only about proofs in LNN, the language of NN. This statement 
does not show the incompleteness of theories that extend the language. 

In Coq the theorem is stated as:

Theorem Incompleteness
  : forall T : System,
    Included Formula NN T ->
    RepresentsInSelf T ->
    DecidableSet Formula T ->
    exists f : Formula,
      Sentence f /\
      (SysPrf T f \/ SysPrf T (notH f) -> Inconsistent LNN T).

A System is Inconsistent if it proves all formulas. 

Definition Inconsistent (T : System) := 
forall f : Formula, SysPrf T f. 

A Sentence is a Formula without any free variables. 

Definition Sentence (f : Formula) :=
  forall v : nat, ~ In v (freeVarFormula LNN f).

A DecidableSet is an Ensemble such that every item either belongs to the
Ensemble or does not belong to the Ensemble. This hypothesis is trivially true
in classical logic, but in constructive logic I needed it to prove the strong
constructive existential quantifier in the statement of incompleteness.

Definition DecidableSet (A : Type) (s : Ensemble A) :=
  forall x : A, mem A s x \/ ~ mem A s x.

The RepresentsInSelf hypothesis restricts what the System T can be. The 
statement of essential incompleteness normally requires T be a recursive set. 
Instead I use the weaker hypothesis that the set T is expressible in the system 
T. 

Given a system T extending NN and another system U along with a formula
$\varphi_U$ with at most one free variable $x_i$, we say $\varphi_U$ expresses the axiom system U
in T if the following hold for all formulas $\psi$.

1. if $\psi \in U$ then $T \vdash \varphi_U[x_i/\ulcorner \psi \urcorner]$

2. if $\psi \notin U$ then $T \vdash \neg\varphi_U[x_i/\ulcorner \psi \urcorner]$

U is expressible in T if there exists a formula $\varphi_U$ such that $\varphi_U$ expresses the
axiom system U in T.

In Coq I write the statement T is expressible in T as 



Definition RepresentsInSelf (T : System) :=
  exists rep : Formula, exists v : nat,
    (forall x : nat, In x (freeVarFormula LNN rep) -> x = v) /\
    (forall f : Formula,
       mem Formula T f ->
       SysPrf T (substituteFormula LNN rep v
                   (natToTerm (codeFormula f)))) /\
    (forall f : Formula,
       ~ mem Formula T f ->
       SysPrf T (notH (substituteFormula LNN rep v
                         (natToTerm (codeFormula f))))).

This is weaker than requiring that T be a recursive set because any recursive set 
of axioms T is expressible in NN. Since T is an extension of NN, any recursive 
set of axioms T is expressible in T. 

By using this weaker hypothesis I avoid defining what a recursive set is. Also, 
in this form the theorem could be used to prove that any complete and consistent 
theory of arithmetic cannot define its own axioms. In particular, this could be 
used to prove Tarski's theorem that the truth predicate is not definable. 

5 Primitive Recursive Functions 

A common approach to proving the incompleteness theorem is to prove that
every primitive recursive function is representable. Informally an n-ary function
$f$ is representable in NN if there exists a formula $\varphi$ such that

1. the free variables of $\varphi$ are among $x_0, \dots, x_n$.

2. for all $a_1, \dots, a_n : \mathbb{N}$,

$\mathrm{NN} \vdash (\varphi \Leftrightarrow x_0 = \ulcorner f(a_1, \dots, a_n) \urcorner)[x_1/\ulcorner a_1 \urcorner, \dots, x_n/\ulcorner a_n \urcorner]$

I defined the type PrimRec n as: 

Inductive PrimRec : nat -> Set :=
  | succFunc : PrimRec 1
  | zeroFunc : PrimRec 0
  | projFunc : forall n m : nat, m < n -> PrimRec n
  | composeFunc :
      forall (n m : nat) (g : PrimRecs n m) (h : PrimRec m),
      PrimRec n
  | primRecFunc :
      forall (n : nat) (g : PrimRec n) (h : PrimRec (S (S n))),
      PrimRec (S n)
with PrimRecs : nat -> nat -> Set :=
  | PRnil : forall n : nat, PrimRecs n 0
  | PRcons : forall n m : nat,
      PrimRec n -> PrimRecs n m -> PrimRecs n (S m).



PrimRec n is the expression of an n-ary primitive recursive function, but it is 
not itself a function. I defined evalPrimRec : forall n : nat, PrimRec n 
-> naryFunc n to convert the expression into a function. Rather than work- 
ing directly with primitive recursive expressions, I worked with particular Coq 
functions and proved they were extensionally equivalent to the evaluation of 
primitive recursive expressions. 
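
The type of n-ary functions and the notion of extensional equivalence can be sketched as below. The text names naryFunc but does not show it, so this definition is an assumed reconstruction; extEqual and the example are hypothetical names introduced for illustration.

```coq
Require Import Arith.

(* naryFunc n is nat -> ... -> nat with n arguments. *)
Fixpoint naryFunc (n : nat) : Set :=
  match n with
  | 0 => nat
  | S m => nat -> naryFunc m
  end.

(* Two n-ary functions are extensionally equal when they agree on all
   arguments; no axiom of functional extensionality is required. *)
Fixpoint extEqual (n : nat) : naryFunc n -> naryFunc n -> Prop :=
  match n return (naryFunc n -> naryFunc n -> Prop) with
  | 0 => fun a b => a = b
  | S m => fun a b => forall c : nat, extEqual m (a c) (b c)
  end.

(* Two syntactically different Coq functions that agree everywhere. *)
Example add_comm_ext :
  extEqual 2 (fun a b : nat => a + b) (fun a b : nat => b + a).
Proof.
  simpl. intros a b. apply Nat.add_comm.
Qed.
```

A statement of the form "this Coq function is primitive recursive" can then be read as: there is an expression e : PrimRec n whose evaluation is extEqual to the function.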

I proved that every primitive recursive function is representable in NN. This
required using Gödel's $\beta$-function along with the Chinese remainder theorem.
The $\beta$-function is a function that codes array indexing. A finite list of numbers
$a_0, \dots, a_n$ is coded as a pair of numbers $(x, y)$ and $\beta(x, y, i) = a_i$. The $\beta$-function
is special because it is defined in terms of plus and times and is non-recursive.
The Chinese remainder theorem is used to prove that the $\beta$-function works.

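For concreteness, here is the usual textbook form of the $\beta$-function, the remainder of $x$ modulo $1 + (i+1)y$. This form is an assumption and may differ from the definition used in the development. The example codes the two-element list 2, 1 as the pair (11, 2): the moduli are 3 and 5, and 11 leaves remainders 2 and 1.

```coq
Require Import Arith.

(* beta x y i = x mod (1 + (i + 1) * y); textbook form, assumed here. *)
Definition beta (x y i : nat) : nat := Nat.modulo x (S (S i * y)).

Example beta_index_0 : beta 11 2 0 = 2.
Proof. reflexivity. Qed.

Example beta_index_1 : beta 11 2 1 = 1.
Proof. reflexivity. Qed.
```

The Chinese remainder theorem is what guarantees that a suitable pair exists for every finite list, once y is chosen so that the moduli are pairwise coprime.
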
I took care to make the formulas representing the primitive recursive functions
clearly $\Sigma_1$ by ensuring that the only unbounded quantifiers are existential;
however, I did not prove that the formulas are $\Sigma_1$ because it is not needed for
the first incompleteness theorem. Such a proof could be used for the second
incompleteness theorem [12].

5.1 codeSubFormula is Primitive Recursive 

I proved that substitution is primitive recursive. Since substitution is defined 
in terms of Formula and Term, it itself cannot be primitive recursive. Instead I 
proved that the corresponding function operating on codes is primitive recursive. 
This function is called codeSubFormula and I proved it is correct in the following 
sense. 

$\mathrm{codeSubFormula}(\ulcorner \varphi \urcorner, i, \ulcorner s \urcorner) = \ulcorner \varphi[x_i/s] \urcorner$

Next I proved that it is primitive recursive. This proof is very difficult. The
problem is again with the need to rebind bound variables. Normally one would
attempt to create this primitive recursive function by using course-of-values
recursion. Course-of-values recursion requires all recursive calls have a smaller code
than the original call. Renaming a bound variable requires two recursive calls.
Recall the definition of substitution in this case:

$(\forall x_j.\varphi)[x_i/s] = \forall x_k.(\varphi[x_j/x_k])[x_i/s]$ where $k \neq i$ and $x_k$ is not free in $\varphi$ or $s$

If one is lucky one might be able to make the inner recursive call. But there is
no reason to suspect the input to the second recursive call, $\varphi[x_j/x_k]$, is going
to have a smaller code than the original input, $\forall x_j.\varphi$.

If I had used the alternative definition of substitution, where all variables 
are substituted simultaneously, there would still be problems. The input would 
include a list of variable and term pairs. In this case a new pair would be added 
to the list when making the recursive call, so the input to the recursive call could 
still have a larger code than the input to the original call. 

It seems that using course-of-values recursion is difficult or impossible. Instead
I introduce the notion of the trace of the computation of substitution.
Think of the trace of computation as a finite tree where the nodes contain the
input and output of each recursive call. The subtrees of a node are the traces
of the computation of the recursive calls. This tree can be coded as a number.
I proved that there is a primitive recursive function that can check to see if a
number represents a trace of the computation of substitution.

The key to solving this problem is to create a primitive recursive function 
that computes a bound on how large the code of the trace of computation can 
be for a given input. With this I created another primitive recursive function 
that searches for the trace of computation up to this bound. Once the trace is 
found — I proved that it must be found — the function extracts the result from 
the trace and returns it. 
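
The search step can be sketched as a bounded search over the natural numbers. The function below is an illustration under assumptions, not the development's code: the name boundedSearch, the boolean test P, and the convention of returning the bound itself when nothing is found are choices made for this example. It uses only structural recursion on the bound.

```coq
(* Least n < bound with P n = true, or bound itself if there is none. *)
Fixpoint boundedSearch (P : nat -> bool) (bound : nat) : nat :=
  match bound with
  | 0 => 0
  | S m =>
      let r := boundedSearch P m in
      if Nat.eqb r m
      then (if P m then m else S m)  (* nothing was found below m *)
      else r                          (* an earlier hit is kept *)
  end.

Example search_hit :
  boundedSearch (fun n => Nat.eqb (n * n) 49) 10 = 7.
Proof. reflexivity. Qed.

Example search_miss : boundedSearch (fun _ => false) 5 = 5.
Proof. reflexivity. Qed.
```

In the setting of the text, P would be the primitive recursive check that a number codes a trace for the given input, and the bound would be computed from that input.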

5.2 checkPrf is Primitive Recursive 

Given a code for a formula and a code for a proof, the function checkPrf returns
0 if the proof does not prove the formula, otherwise it returns one plus the code of
the list of axioms used in the proof. I proved this function is primitive recursive,
as well as proving that it is correct in the sense that for every proof $p$ of $\varphi$
from a list of axioms $\Gamma$, $\mathrm{checkPrf}(\ulcorner \varphi \urcorner, \ulcorner p \urcorner) = 1 + \ulcorner \Gamma \urcorner$; and for all $n, m : \mathbb{N}$ if
$\mathrm{checkPrf}(n, m) \neq 0$ then there exists $\varphi$, $\Gamma$, and some proof $p$ of $\varphi$ from $\Gamma$ such
that $\ulcorner \varphi \urcorner = n$ and $\ulcorner p \urcorner = m$.

For any axiom system U expressible in T, I created the formulas codeSysPrf
and codeSysPf. $\mathrm{codeSysPrf}[x_0/\ulcorner n \urcorner, x_1/\ulcorner m \urcorner]$ is provable in T if m is the code
of a proof in U of a formula coded by n. $\mathrm{codeSysPf}[x_0/\ulcorner n \urcorner]$ is provable in T if
there exists a proof in U of a formula coded by n.

codeSysPrf and codeSysPf are not derived from primitive recursive functions
because I wanted to prove the incompleteness of axiom systems that may
not have a primitive recursive characteristic function.

6 Fixed Point Theorem and Rosser's Incompleteness Theorem

The fixed point theorem states that for every formula $\varphi$ there is some formula
$\psi$ such that

$\mathrm{NN} \vdash \psi \Leftrightarrow \varphi[x_i/\ulcorner \psi \urcorner]$

and that the free variables of $\psi$ are those of $\varphi$ less $x_i$.

The fixed point theorem allows one to create "self-referential sentences". I 
used this to create Rosser's sentence which states that for every code of a proof 
of itself, there is a smaller code of a proof of its negation. The proof of Rosser's 
incompleteness theorem requires doing a bounded search for a proof, and this 
requires knowing what is and what is not a proof in the system. For this reason,
I require the decidability of the axiom system. Without a decision procedure for
the axiom system, I cannot constructively do the search.



6.1 Incompleteness of PA 

To demonstrate the incompleteness theorem I used it to prove the incompleteness 
of PA. I created a primitive recursive predicate for the codes of the axioms of 
PA. Coq is sufficiently powerful to prove the consistency of PA by proving that 
the natural numbers model PA. 

One subtle point is that Coq's logic is constructive while the internal logic 
is classical. One cannot interpret a formula of the internal logic directly in Coq 
and expect it to be provable if it is provable in the internal logic. Instead I use a 
double negation translation of the formulas. The translated formula will always 
hold if it holds in the internal logic. 
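
The need for the translation can be seen on the law of the excluded middle. It is provable in the classical internal logic, but Coq cannot prove it for an arbitrary proposition, while its double negation is provable constructively. The lemma below is a generic illustration with a hypothetical name, not part of the development.

```coq
(* The double negation of excluded middle holds constructively. *)
Lemma dn_excluded_middle : forall P : Prop, ~ ~ (P \/ ~ P).
Proof.
  intros P H.
  apply H. right. intro p.
  apply H. left. exact p.
Qed.
```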

The consistency of PA along with the expressibility of its axioms and the
translations of proofs from NN to PA allowed me to apply Rosser's incompleteness
theorem and prove the incompleteness of PA: there exists a sentence $\varphi$
such that neither $\mathrm{PA} \vdash \varphi$ nor $\mathrm{PA} \vdash \neg\varphi$.

Theorem PAIncomplete :
  exists f : Formula,
    (forall v : nat, ~ In v (freeVarFormula LNT f)) /\
    ~ (SysPrf PA f \/ SysPrf PA (notH f)).

7 Remarks 

7.1 Extracting the Sentence 

Because my proof is constructive, it is possible, in principle, to compute this
sentence that makes PA incomplete. This was not done for two reasons. The
first reason is that the existential statement lives in Coq's Prop universe, and
Coq only extracts from its Set universe. This was an error on my part. I should
have used Coq's Set existential quantifier; this problem would be fairly easy to
fix. The second reason is that the sentence contains a closed term of the code
of most of itself. I believe this code is a very large number and it is written in
unary notation. This would likely make the sentence far too large to be actually
printed.

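The difference between the two existential quantifiers can be seen on a toy statement. The names below are hypothetical. The Prop version proves that a witness exists but offers no projection into computational code, whereas the Set version, written with the sig type, carries a witness that can be recovered with proj1_sig.

```coq
(* Existential in Prop: the witness cannot be projected out as data. *)
Lemma half_of_four_Prop : exists n : nat, n + n = 4.
Proof. exists 2. reflexivity. Qed.

(* Existential in Set: a dependent pair of a witness and its proof. *)
Definition half_of_four_Set : {n : nat | n + n = 4} :=
  exist (fun n : nat => n + n = 4) 2 eq_refl.

(* The witness is ordinary data and computes. *)
Example witness_is_two : proj1_sig half_of_four_Set = 2.
Proof. reflexivity. Qed.
```
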
7.2 Robinson's System Q 

The proof of essential incompleteness is usually carried out for Robinson's system
Q. Instead I followed Hodel's development [10] and used NN. Q is PA with the
induction schema replaced with $\forall x_0.\exists x_1.(x_0 = 0 \vee x_0 = Sx_1)$. All of NN's
axioms are $\Pi_1$ whereas Q has the above $\Pi_2$ axiom. Both axiom systems are
finite.

Neither system is strictly weaker than the other, so it would not be possible 
to use the essential incompleteness of one to get the essential incompleteness of 
the other; however both NN and Q are sufficiently powerful to prove a small 
number of needed lemmas, and afterward only these lemmas are used. If one 
abstracts my proof at these lemmas, it would then be easy to prove the essential 
incompleteness of both Q and NN. 



7.3 Comparisons with Shankar's 1986 Proof 

It is worth noting the differences between this formalization of the incompleteness 
theorem and Shankar's 1986 proof in the Boyer-Moore theorem prover. The most 
notable difference is the proof systems. In Coq the user is expected to input the 
proof, in the form of a proof script, and Coq will check the correctness of the 
proof. In the Boyer-Moore theorem prover the user states a series of lemmas and 
the system generates the proofs. However, using the Boyer-Moore proof system 
requires feeding it a "well-chosen sequence of lemmas" [12, p. xii], so it would
seem the information being fed into the two systems is similar.

There are some notable semantic differences between Shankar's statement 
of incompleteness and mine. His theorem only states that finite extensions of 
Z2, hereditarily finite set theory, are incomplete, whereas my theorem states 
that even infinite extensions of NN are incomplete as long as they are self- 
representable. Also Shankar's internal logic allows axioms to define new relation 
or function symbols as long as they come with the required proofs of admissibil- 
ity. Such extensions are conservative over Z2, but no computer verified proof of 
this fact is given. My internal logic does not allow new symbols. Finally, I prove 
the essential incompleteness of NN, which is in the language of arithmetic. Without
any set structures the proof is somewhat more difficult because it requires
using Gödel's $\beta$-function.

One of Shankar's goals when creating his proof was to use a proof system 
without modifications. Unfortunately he was not able to meet that goal; he ended 
up making some improvements to the Boyer-Moore theorem prover. My proof 
was developed in Coq without any modifications. 

7.4 Gödel's Second Incompleteness Theorem

The second incompleteness theorem states that if T is a recursive system extending
PA (actually a weaker system could be used here) and $T \vdash \mathrm{Con}_T$ then
T is inconsistent. $\mathrm{Con}_T$ is some reasonable formula stating the consistency of T,
such as $\neg\mathrm{Pr}_T(\ulcorner 0 = S0 \urcorner)$, where $\mathrm{Pr}_T$ is the provability predicate codeSysPf for
T.

If I had created a formal proof in PA, I would have
$\vdash_{\mathrm{PA}}$ "Gödel's first incompleteness theorem". This could then be mechanically
transformed to create another formal proof in PA that
$\vdash_{\mathrm{PA}}$ (PA $\vdash$ "Gödel's first incompleteness theorem"). The reader can verify
that the second incompleteness theorem follows from this. Unfortunately I
have only shown that $\vdash_{\mathrm{Coq}}$ "Gödel's first incompleteness theorem", so the
above argument cannot be used to create a proof of the second incompleteness
theorem.

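Still, this work can be used as a basis for formalizing the second incompleteness
theorem. The approach would be to formalize the Hilbert-Bernays-Löb
derivability conditions:

1. if $\mathrm{PA} \vdash \varphi$ then $\mathrm{PA} \vdash \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \urcorner)$

2. $\mathrm{PA} \vdash \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \urcorner) \Rightarrow \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \urcorner) \urcorner)$

3. $\mathrm{PA} \vdash \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \Rightarrow \psi \urcorner) \Rightarrow \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \urcorner) \Rightarrow \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \psi \urcorner)$

The second condition is the most difficult to prove. It is usually proved by first
proving that for every $\Sigma_1$ sentence $\varphi$, $\mathrm{PA} \vdash \varphi \Rightarrow \mathrm{Pr}_{\mathrm{PA}}(\ulcorner \varphi \urcorner)$. Because I made
sure that all primitive recursive functions are representable by a $\Sigma_1$ formula,
it would be easy to go from this theorem to the second Hilbert-Bernays-Löb
condition.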

8 Statistics 

My proof, excluding standard libraries and the library for Pocklington's criterion
[2], consists of 46 source files, 7 036 lines of specifications, 37 906 lines of proof,
and 1 267 747 total characters. The size of the gzipped tarball (gzip -9) of all 
the source files is 146 008 bytes, which is an estimate of the information content 
of my proof. 